Estimation of Speaker’s Height and Speech Sign
Abstract
Estimating a speaker’s height and vocal tract length (VTL) from the speech signal has applications in forensics and automatic speech recognition. It has long been suggested that a speaker’s VTL is correlated with the speaker’s height on the one hand and with formant frequencies on the other. Until recently, these putative relationships had been examined empirically only in studies with relatively small numbers of speakers. Scattered studies have reported intriguing results on the correlations between a speaker’s height and various acoustic speech parameters. Owing to a lack of databases, few studies have presented extensive comparisons between a speaker’s actual VTL and the VTL estimated from the speech signal. This paper presents an analysis of the correlations between various acoustic speech parameters and speaker height for a large number of speakers. It also presents a new method for optimal estimation of a speaker’s height and VTL from various acoustic speech parameters.
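To make the link between formant frequencies and VTL concrete, the sketch below uses the textbook uniform-tube approximation (a tube closed at the glottis and open at the lips), in which the n-th resonance is F_n = (2n - 1)c / (4L). This is not the estimation method proposed in the paper, which the abstract does not describe. The function name, the speed-of-sound constant, and the simple averaging over formants are all assumptions made for illustration.

```python
# Illustrative sketch only: uniform-tube (quarter-wavelength) model,
# NOT the estimator proposed in the paper.

# Approximate speed of sound in warm, humid air, in cm/s (assumed value).
SPEED_OF_SOUND_CM_S = 35000.0


def vtl_from_formants(formants_hz, c=SPEED_OF_SOUND_CM_S):
    """Return a vocal tract length estimate in cm.

    formants_hz: formant frequencies F1, F2, ... in Hz, in ascending order.
    Each formant F_n gives one estimate L = (2n - 1) * c / (4 * F_n);
    the estimates are averaged (a hypothetical, unweighted combination).
    """
    if not formants_hz:
        raise ValueError("at least one formant frequency is required")
    if any(f <= 0 for f in formants_hz):
        raise ValueError("formant frequencies must be positive")
    estimates = [
        (2 * n - 1) * c / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Formants of an idealized neutral vowel produced by a uniform tube.
    print(vtl_from_formants([500.0, 1500.0, 2500.0]))
```

Real vowels deviate from the uniform-tube assumption, so per-formant estimates generally disagree; how those estimates are weighted or combined is exactly the kind of choice an optimized estimator would have to make.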
Similar resources
Efficient Pitch-based Estimation o
To reduce inter-speaker variability, vocal tract length normalization (VTLN) is commonly used to transform acoustic features for automatic speech recognition (ASR). The warp factors used in this process are usually derived by maximum likelihood (ML) estimation, involving an exhaustive search over possible values. We describe an alternative approach: exploit the correlation between a speaker’s a...
Speaker adaptation in noisy environments based on parameter estimation using uncertain data
This paper describes a new method for the speaker adaptation of HMM parameters in environments with background noise. This method is based on Bayesian estimation, and calculates the a posteriori distribution of clean-speech HMM parameters from their a priori distribution by using noisy speech observations. The advantage of the method is that the distribution of the noise can be taken into account ...
Estimating the Stability and Dispersion of the Biometric Glottal Fingerprint in Continuous Speech
The speaker’s biometric voice fingerprint may be derived from voice as a whole, or from the vocal tract and glottal signals, after separation by inverse filtering. This last approach has been used by the authors in early work, where it has been shown that the biometric fingerprint obtained from the glottal source or related speech residuals gives a good description of the speaker’s identity and...
An Acoustic Study of Emotivity-Prosody Interface in Persian Speech Using the Tilt Model
This paper aims to explore some acoustic properties (i.e. duration and pitch amplitude of speech) associated with three different emotions: anger, sadness and joy against neutrality as a reference point, all being intentionally expressed by six Persian speakers. The primary purpose of this study is to find out if there is any correspondence between the given emotions and prosody patterning in P...
Statistical Study of Speaker’s Peculiarities of Utterances into Phrases Segmentation
The report is concerned with the experimental study of the idiosyncrasy of utterance-into-phrase segmentation observed in the speech of a popular Russian TV-anchorman and two TV-news readers. Comparative statistical estimation of relative frequencies of occurrence of pauses of various duration, frequencies of occurrence of phrases and pairs of phrases with a different number of accent units wer...